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COMBINED  LINEAR  AND  NONLINEAR  MODELING  OF  DATA 


INTRODUCTION 

In  numerous  practical  applications,  data  are  recorded  for 
observation  and  scrutiny.  For  example,  several  receiving 
elements  of  an  array  may  be  observed  over  a  time  interval  of 
interest  in  an  effort  to  detect  the  presence  of  a  source, 
estimate  its  location  and  speed,  and  characterize  some  of  its 
attributes,  such  as  source  frequency.  At  the  same  time,  there 
may  be  additional  unknowns,  such  as  the  in-phase  and  quadrature 
amplitudes  of  multiple  arrivals  at  the  receiving  elements, 
perhaps  via  direct  and/or  surface  reflection  paths. 

When  a  model  of  the  received  signal(s)  in  the  available 
measured  data  from  the  source  is  specified  or  adopted,  the  model 
generally  will  contain  some  unknown  linear  parameters  and  some 
unknown  nonlinear  parameters.  For  example,  the  in-phase  and 
quadrature  amplitudes  of  the  received  signal  components  will 
appear  linearly  in  the  source  model  waveform,  while  the  source 
location,  speed,  and  center  frequency  will  appear  nonlinearly  in 
the  particular  source  model  waveform.  The  exact  nonlinear 
functions  depend  on  the  configuration  and  geometry  of  the 
situation  of  interest.  An  example  of  a  received  signal  waveform 
is  A  s(t-t),  where  amplitude  A  appears  linearly,  while  delay  x 
appears  nonlinearly,  that  is,  inside  the  function  s(  ). 
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To  determine  the  source  characteristics,  it  is  necessary  to 
estimate  all  the  unknown  parameters  from  the  available  data; 
however,  this  can  often  be  accomplished  via  a  sequential 
approach.  For  example,  if  the  nonlinear  model  parameter  values 
are  initially  hypothesized,  an  analytic  solution  for  the 
condi tionally-optimum  linear  model  parameter  values  may  be 
obtainable  analytically  through  minimization  of  some  error 
criterion.  The  condi tionally-optimum  linear  parameters  can  then 
be  substituted  back  into  the  error  criterion,  resulting  in  the 
reduced-dimension  conditionally-minimum  error,  thereby  often 
significantly  reducing  the  size  of  the  resultant  search  problem 
for  the  best  (nonlinear)  parameter  values  of  the  model.  Reducing 
the  dimensionality  of  the  search  space  is  an  extremely  beneficial 
step  in  terms  of  the  amount  of  execution  time  required  to  find 
the  minimum  error. 

When  this  sequential  approach  to  a  least-magnitude-squares 
error  minimization  is  taken,  the  conditionally-minimum  error 
takes  on  a  Hermitian  form  with  dimension  equal  to  the  number  of 
nonlinear  model  parameters  in  the  original  error  definition.  To 
speed  up  the  search  for  the  global  minimum  in  this  (possibly) 
high-dimensional  space,  it  is  useful  to  be  able  to  compute  the 
gradient  vector  (slopes)  and  the  Hessian  matrix  (curvatures)  of 
the  conditionally-minimum  error  at  any  point  in  the  nonlinear 
search  space.  These  quantities  are  indicative  of  the  direction 
and  step  size  that  the  next  iterate  for  the  minimum  error  should 
take . 
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When  this  task  is  undertaken  here,  the  Hessian  matrix  is 
found  to  contain  a  "destabilizing"  term.  A  similar  quantity  for 
the  brute-force  minimization  of  the  original  total  squared  error 
is  presented  and  discussed  in  reference  1  (page  683,  equations 
(15.5.7)  and  (15.5.11)).  However,  that  earlier  approach  took  no 
advantage  of  the  fact  that  the  linearly-appearing  model 
parameters  can  and  should  be  eliminated  analytically,  thereby 
significantly  reducing  the  dimensionality  and  execution  time  of 
the  nonlinear  search  for  the  global  minimum  of  the  specified 
error  criterion.  The  current  modified  Hessian  matrix  is  not  as 
simple  as  that  cited  in  reference  1;  however,  once  again,  no 
second-order  derivative  terms  of  the  basis  functions  are 
involved,  thereby  reducing  the  amount  of  analytical  and  computer 
effort  required. 

The  use  of  the  gradient  vector  and  the  Hessian  matrix  of  the 
conditionally-minimum  error  is  of  limited  utility  if  the 
nonlinear  search  procedure  does  not  start  in  the  high-dimensional 
error  valley  containing  the  global  minimum.  This  observation 
strongly  suggests  that  a  considerable  fraction  of  the  search 
effort  should  be  devoted  initially  to  locating  this  correct  error 
valley,  at  least  coarsely,  before  beginning  a  possibly  wasteful 
time-consuming  fine-grained  search  for  a  local  error  minimum. 

This  report  also  addresses  the  issue  of  rates  of  sampling  in 
the  various  dimensions  of  the  nonlinear  model  variables  since 
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that  issue  directly  affects  the  amount  of  time  devoted  to  the 
initial  coarse  search  for  the  proper  error  valley. 
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ERROR  DEFINITION  AND  MINIMIZATION 

The  measured  data  set  is  presumed  to  contain  N  values: 
d  -  tdj  •••  dN]T  ,  id 

where  (dn)  could  be  complex.  These  N  (random)  data  values  could 
consist,  for  example,  of  N  spatial  samples  (array  elements)  and 
Nfc  time  samples,  in  which  case,  N  =  Ng  Nt- 


The  basis  set  of  functions  consists  of  K  known  complex 

functions  (b^x,©)}  for  k=l:K,  where  x  is  a  general  real  location 

variable  (space  and  t-ime)  at  which  the  data  are  measured;  that 

is,  data  value  d  is  measured  at  known  location  x  .  n=l:N.  The 

n  n 

Mxl  nonlinear  unknown  complex  parameter  vector  is 

6  =  [01  ”*  0m]T  •  (2) 


These  basis  functions  are  weighted  and  summed  to  form  a  fit  to 
the  measured  data  according  to 

K 


fn(6»a)  =  XZ1  ak  bk(xn,6)  for  n=1:N  ' 
k=l 


where  the  Kxl  unknown  complex  amplitude  vector  is 


T 

a  =  [ax  • • •  aR] 


(3) 


(4) 


The  fitting  function,  equation  (3),  is  linear  in  the  Kxl 
amplitude  vector  a,  but  it  is  nonlinear  in  the  Mxl  parameter 
vector  ©. 
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An  instantaneous  error  between  fit  and  data  is  defined  as 


en(0,a)  =  fn(6,a)  -  dR  for  n=l:N  , 
or,  in  vector  notation, 

e(  0,a)  =  [e^Gra)  •••  eN(0,a)]T  =  f(0,a)  -  d  . 

By  use  of  equation  (3),  the  Nxl  fitting  vector  f(0,a)  can  be 
expressed  as 

f ( ©,a )  -  [f1(0,a)  •••  fN(©,a)]T  =  B(0)  a  , 
where  the  NxK  basis  function  matrix  B(0)  is  given  by 
rb^x^©)  •••  b^x^©)! 


b  ( 0 )  =  • 


bl(XN,e)  *•*  bK(XN'0) 


(5) 


(6) 


(7) 


(8) 


Observe  that  the  (0,a)  dependency  of  f(0,a)  in  equation  (7)  has 
factored  into  the  product  of  a  matrix  dependent  only  on  0  times 
the  amplitude  vector  a. 


The  total  scalar  error  of  the  complete  fit  is  defined  as 
E(0,a)  =  e(0,a)H  e(0,a)  =  [B(0)  a  -  d]H  [B(0)  a  -  d] 

-  aH  y( 0)  a  -  aH  0(0)  -  P(0)H  a  +  dH  d  ,  (9) 

where  complex  matrices 

y( 0)  -  B( ©)H  B ( 0 )  =  y(0)H  ,  P( 0)  =  B(0)H  d  .  (10) 
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Matrix  y ( 0 )  is  KxK  and  is  positive  definite,  while  vector  0(0)  is 
Kxl.  These  are  relatively  small  matrices,  at  least  compared  to 
the  dimensionality  N  of  the  measured  data  { <3n } . 


ERROR  MINIMIZATION 

The  scalar  error  in  equation  (9)  can  be  expressed  as 

E(  0,  a )  =  [a  -  y(  0)~ 1  P(0)1H  y(9)  [a  -  yO)"1  0(0)] 

+  dH  d  -  0(  0 ) H  yO)"1  0(9)  •  (11) 

Since  matrix  y(6)  is  positive  definite  for  all  0,  the  best 
(random)  amplitude  vector  a  to  minimize  E(0,a)  is  given  by 

a(0)  =  y(0)_1  0(0)  ,  (12) 

which  depends  on  the  particular  hypothesized  vector  value  of  0. 

As  nonlinear  parameter  vector  0  changes,  this  conditionally- 
optimum  amplitude  vector  a(0)  also  varies. 

An  equivalent  form  to  equation  (12)  is 

y( 6)  a( 0)  =  0(0)  or  B( @)H  B ( 0 )  a(0)  =  B(0)H  d  ,  (13) 

which  is  recognized  as  the  normal  form  of  the  equations  that 
result  from  the  least-squares  procedure  for  the  fit  B(0)  a  ~  d. 
This  result  is  also  evident  from  the  upper  line  of  equation  (9). 
In  MATLAB  notation,  solution  a(0)  =  B(0)\d  instead  of  equation 
(12). 
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Upon  substitution  of  solution  a(0)  into  equation  (11),  the 
condi tionally-minimum  scalar  error  becomes 

E(  0 )  ■  E( 0,  a( 0) )  =  dH  d  -  0(0)H  Y(e)_1  0(0) 

=  dH  d  -  0( 0)H  a( 0)  .  (14) 

This  quantity  E(0)  is  the  minimum  total  scalar  error  for  a  given 

or  hypothesized  parameter  vector  value  0.  Vector  0  is  presumed 

real  henceforth.  E(0)  must  now  be  further  minimized  by  choice  of 

0.  This  must  be  accomplished  by  a  search  in  the  M-dimensional 

space  of  0.  A  brute-force  search  directly  on  Hermitian  form 
11  _  1 

3(0)  y(6)  0(9)/  for  its  maximum  in  0,  is  one  possible 

alternative.  However,  the  M-dimensional  search  is  often 
accomplished  more  quickly  by  using  the  gradient  vector  and  the 
Hessian  matrix  of  the  condi tionally-minimum  error  E(@),  at  least 
when  the  correct  error  valley  of  E(0)  has  already  been  located. 

GRADIENT  VECTOR  OF  E(0) 

A  partial  derivative  with  respect  to  the  m-th  component  ©m 
of  vector  0  in  equation  (2)  will  be  denoted  by  a  subscript  m, 
m=l:M.  Then,  there  follows  from  equation  (14),  scalars 

Em(0)  =  g|-  E(©)  -  -  Pm(0)H  y(0)_1  0(0)  -  P(0)H  Y ( © ) ” 1  0m(0) 
m 

+  p(0)H  y(0)_1  Ym(0)  y(0)_1  0(9)  for  m=l :M  .  (15) 
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On  the  other  hand,  consider  from  equation  (9)  the  quantity 


j|-  E( 0, a )  =  aH  Ym(0)  a  -  aH  fiJQ)  -  Pm(0)H  a  ,  (16) 

m 


where  amplitude  vector  a  still  has  a  general  (but  fixed)  value. 
Upon  now  setting  general  value  a  equal  to  the  condi tionally- 
optimum  value  a(0),  equation  (16)  yields 


30 


E(0,a) 


m 


a=a( 0) 


=  a(0)H  rm(e)  a(  0)  -  2  Re(a(0)H  Pm(0))  (17) 


=  P(0)H  y(6)  1  Ym(e)  Y(  6 )  1  0(0)  -  2  Re(p(0)H  y(6)  1  Pm(9)).  (18) 


This  result  is  identical  to  equation  (15);  that  is,  for  all  0, 
the  gradient  components  of  E(0)  satisfy 


E  (0) 
— m 


E(  0, a(  0)  )  = 


a 

ae_ 


E(0,a) 


m 


for  m=l:M 


a=a( 0) 


(19) 


where  a(0)  =  y(9)_1  P(0),  and  the  latter  quantities  are  given  by 
equation  (10).  Thus,  the  setting  of  general  a  to  the  optimum 
a(0)  can  be  done  either  before  or  after  taking  the  partial 
derivative  of  the  total  error  E(0,a)  with  respect  to  ©m. 


The  forms  in  equations  (15)  and  (18)  are  not  immediately 
useful  because  rm(0)  and  Pm(0)  have  not  yet  been  evaluated.  From 
equations  (17)  and  (19), 

E  (0)  =  aH  y„  a  -  2  Re  faH  6  1  for  m=l:M  ,  (20) 

—in  —  m  —  v“  JR/ 

where  0  has  been  temporarily  suppressed  from  the  right  side  of 
this  equation.  From  equation  (10),  however. 
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(21) 


y  =  BH  B  +  BH  B  ,  3  =  BH  d  for  m=l:M  . 

1  m  m  ra  m  m 

At  this  point,  define  the  Nxl  vectors 

c  **  B  a  for  m=l:M  .  (22) 

m  m  — 

Then,  equation  (20)  yields 

» 

E  (0)  -  aH  (b”  B  +  bh  Bm)  a  -  aH  bJJ  d  -  dH  B  a 
— m  —  m  m  —  —  m  m  — 

=  cH  B  a  +  aH  BH  c  -  c!!  d  -  dH  c  .  (23) 

m  —  —  mm  m 

Upon  restoration  of  the  0  dependence,  this  becomes,  for  m=l:M, 

E  (0)  =  a|-  E( 0,a( 0) )  =  2  Re (c  ( 0)H  [ B( 0)  a( 0)  -  d]l  .  (24) 

—m  o  o  —  V  m  —  / 

m 

An  alternative  form  is  available  from  equations  (6)  and  (7) 
in  terms  of  the  conditionally-minimum  error  vector.* 

e( 0)  fi  e( 0, a( 0) )  =  B(0)  a(0)  -  d  ,  (25) 

namely, 

E  (0)  -  2  Re  fc  ( 0)H  e( 0)1  for  m=l:M  .  (26) 

This  form  may  be  useful  computationally,  in  that  the  Nxl  error 
vector  e(0)  is  independent  of  m  and  needs  to  be  calculated  only 
once  at  each  ©  of  interest;  on  the  other  hand,  Nxl  vector  cm(0) 
depends  on  both  m  and  0. 
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A  quantity  that  plays  a  prominent  role  in  the  calculation  of 
the  gradient  of  E(0)  is  the  NxK  partial  derivative  matrix  Bm(0) 
that  arose  in  equation  (21).  From  equation  (8),  it  follows  that 

for  m=l:M.  (27) 


The  basic  derivatives  required  are  3bk(x,0)/3©m  for  k=l:K,  m=l:M, 
which  must  be  evaluated  at  each  Nxl  location  vector  x  and  Mxl 
parameter  vector  0  of  interest.  Then,  from  equation  (22),  the 
quantities 

cm(e)  =  Bm(0)  a( 0)  for  m=l :M  (28) 

can  be  evaluated.  Finally,  their  use  in  equation  (24)  allows  for 
calculation  of  the  gradient  vector  of  E(0).  Alternatively,  the 
combination  of  equations  (26)  and  (28)  yields 

5m<e)  -  2  Re(a<e)H  Bm ( e ) H  e(e))  for  m-l:M  .  (29) 


Bm(8)  *  3e-B(6)  ' 
m 


30  bi(xi'0)  •'*  90  b  (xlf0) 
m  m 


30  bl(xN'0)  **'  30  bK*XN'0) 
m  m 


The  equality  of  first-order  partial  derivatives  in  equation 
(19)  does  not  extend  to  second  order;  that  is. 


3  0  30 

m  m 


E(0,a) 


a=a( 0) 


30_  30 
m  m 


E(  0,a( 0)  ) 


For  example,  with  N=l,  K=l,  M=l,  it  follows  that 


(30) 


11 


e(01,a1)  =  bfXj,©^  -  d1  ■  a1  b  -  dj  , 

2 

E( , a^ )  =  | a^  b  —  d^ |  ,  a^  =  d^/b  , 

13(0^  =  E(©1,a1)  =  0  for  all  0^  .  (31) 


On  the  other  hand. 


E(01,a1)  =  ax  (ax  b  -  d^*  b^  +  a* 


30, 


E(01,a1)  =  2  |a1|2  |bQJ2  +  2  Re  (a1 


30 


2  E(0i»ai) 


a^a^©^ 


2  la?  be. 


=  2 


(al  b'  -  dl)  b0x' 

<al  b  -  d1)*  b^J  , 
|d1  b0^/b | 2  >  0.  (32) 


This  positive  value  is  in  contrast  to  all  the  zero  derivatives 
that  will  result  from  equation  (31)  for  all  values  of  0^ .  Thus, 
there  is  no  shortcut  at  the  second-order  level  corresponding  to 
that  in  equation  (19)  at  the  first-order  level.  To  determine  the 
Hessian  matrix  of  E(0),  it  is  necessary  to  deal  directly  with 
forms  (17),  (24),  (26),  or  (29). 


HESSIAN  MATRIX  OF  E(0) 

Consider  the  second-order  partial  derivative  of  the  original 
scalar  error  E(0,a)  with  respect  to  the  components  of  0;  namely, 
from  equation  (16), 


12 


30„  36 
m  m 


E(6,a)  =  aH  y__ ( 9 )  a  -  2  Re  faH  0 _ (0)1  for  m,m=l: 

mm  v,  mm  )  — 


M 


Substitution  of  the  optimum  amplitude  vector  a(0)  =  y(0) 
at  this  stage  then  yields 


-1 


30  30 

m  m 


E(  0, a ) 


a=a( 0) 


a(©)H  rmm(6)  a( 0)  -  2  Re  fa( 0) H  pmm 
—  mm  —  V-  mm 


.  (33) 

0(e) 


0(0)H  Y(0)  1  Ymm(6)  y(  6 )  1  0(  0 )  -  2  Re(p(0)H  y(0)  1  0mm(6))  (34) 

for  m,m=l:M.  However,  this  quantity  is  not  the  m,m  term  of  the 
Hessian  matrix  of  conditionally-minimum  error  E(0)  in  equation 
(14),  as  will  be  seen  now. 


From  equations  (17)  and  (19),  suppressing  0  on  the  right  side 
temporarily, 

Em(0)  «  j§-  E(  0, a(  0) )  =  aH  Ym  a  -  aH  Pm  -  p“  a  for  m-l:M  .  (35) 
m 

Then,  the  m,m  term  of  the  Hessian  matrix  of  E(0)  =  E(0,a(0))  is 

2  2 

!We)  *  ael-ae-  *<«>  *  W4<r  E<9'a<9>>  =  aH  a 

—  mm  mm  — 

+  aH  y  a  +  aH  y  a  -  aH  p  -  pH  a  -  aH  p  -  pH  a  .  (36) 

— m  '  m  —  —  m  — m  — m  Km  — m  —  Kmm  Kmm  —  v  / 


From  equation  (13),  however, 


v  a  -  6  ,  rm  a  +  r  am  -  |Sm  , 


and  it  follows  that 


(37) 
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for  m=l:M  . 


(38) 


r  a 
’m  — 


*m  - 


Y  *m 


Substitution  in  equation  (36)  and  simplification  results  in 


E  (0)  -  aH  y  a  -  2  Re  faH  0  1  -  2  Re  fa”  y  a  )  (39) 
—mm  —  'mm  —  V—  mmj  v— m  — ny 

for  m,m=l:M.  There  is  an  additional  (last)  term  in  equation  (39) 
that  is  absent  in  equation  (34).  Thus,  the  attempted  shortcut  in 
equations  (33)  and  (34)  is  incorrect. 


To  determine  the  correct  Hessian  matrix,  equation  (10)  is 
used  to  obtain  the  relations 


0=  b”  d  , 
m  m 


0  =  bh  d  , 

Kmm  mm 


ym  -  b”  B  +  BH  Bm  , 
mm  m 


'mm 


bh  b  +  b”  b  +  b”  b  +  bh  b 
mm  mm  mm  mm 


(40) 


Substitution  in  equation  (39)  and  simplification  yield  the  exact 
result  for  the  terms  of  the  Hessian  matrix  of  E(9)  as 


5mm(9)  -  2  Ee((Bm  (Bm  S>)  "  2  Re((B  Sm)H  <B  »„)) 

( B  a  -  d)j  for  m , m=l : M  . 


+  2  Re 


fe1 


H  H 
Bmm 


(41) 


From  equation  (13),  it  follows  that 

BH  (B  a  -  d)  =0  (Kxl  vector)  .  (42) 

Although  this  Kxl  relation  does  not  require  that  B  a  -  d  =  0 
(Nxl  vector),  it  suggests  that  Nxl  vector  B  a  -  d  will  be  small; 
in  fact,  from  equation  (25),  this  Nxl  vector  is  just  the 
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condi tionally-minimum  error  vector  e(9)  at  the  0  value  of 
interest.  The  N  component  terms  of  e(0)  can  have  either  polarity 
and  will  be  unrelated  to  the  second-order  partial  derivatives 
(B^m) »  which  are  solely  model  dependent.  Therefore,  the  last 
scalar,  that  is,  the  bottom  line  of  equation  (41),  will  tend  to 
average  out  to  zero  and  could  be  dropped  if  desired.  This 
destabilizing  term  is  very  similar  to  that  in  reference  1  (page 
683)  with  relation  to  their  simpler  least-squares  problem  in 
equations  (15.5.5)  and  (15.5.11).  Therefore,  a  reasonable 
approximation  to  the  Hessian  matrix  of  E(0)  is  afforded  by 

Sim>(9)  *  2  Be(<Bm  5>H  (Bm  5>)  "  2  Ee((B  V"  (B  %>)  (43> 

for  m,m=l:M.  This  result  applies  for  all  0.  No  second-order 
partial  derivatives  of  the  basis  functions  {bk(x,0)J  are  required 
to  evaluate  this  approximate  Hessian  matrix  (43).  Only  the 
first-order  partial  derivatives  indicated  in  equation  (27)  need 
be  evaluated. 

Before  the  exact  result  (41)  or  approximation  (43)  can  be 
used,  the  quantities  {amJ  for  m=l:M  need  to  be  determined.  From 
equation  (37),  it  follows  that 

y  a  ■  B  -  r„  a  for  m=l:M  (Kxl  vectors)  .  (44) 

— m  m  m  — 

Use  of  this  relation  in  equation  (43)  yields  an  alternative  form 
for  the  elements  of  the  approximate  Hessian  matrix  as 
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Sim’'9'  -  2  Re(<Bn,  ?>H  <Bm  2)] 

-  2  ReK  -  •>“  r-1  (P„  -  t„  a))  .  (45) 

To  summarize,  the  Nxl  data  vector  d  is  given  by  equation  (1) 
while  the  NxK  basis  function  matrices  B  and  B  are  given  by 
equations  (8)  and  (27),  respectively.  The  y  and  0  matrices  are 
given  by  equation  (10),  while  the  condi tionally-optimum  amplitude 
vector  a  is  presented  in  equation  (12).  The  partial  derivatives 
of  0  and  y  are 

0m(0)  -  Bm( 0)H  d  for  m=l :M  ,  (46) 

and 

Ym(e)  =  Bm(6)H  B40)  +  B ( 0 ) H  Bm(0)  =  rm(0)H  for  m=l :M  .  (47) 

Each  PmO)  is  a  Kxl  vector,  while  each  Ym(0)  is  a  KxK  matrix. 
Here,  K  is  the  number  of  basis  functions  {b^fx,©)},  M  is  the  size 
of  nonlinear  real  parameter  vector  0,  and  N  is  the  total  number 
of  data  points  {dnJ  to  be  fit. 
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SAMPLING  RATES  FOR  PARAMETER  VECTOR  6 


The  derivations  in  the  previous  section  for  the  gradient 
vector  and  the  Hessian  matrix  of  the  condi tionally-minimum  error 
E(  0 )  are  useful  only  when  the  0  valley  of  E(0)  containing  the 
global  minimum  has  been  located.  Otherwise,  a  fine-grained 
search  in  any  other  0  valley  yields  only  a  local  minimum  of  E(0), 
which  could  have  a  significantly  larger  value  than  the  global 
minimum  of  E(0).  Therefore,  a  very  significant  fraction  of  the 
total  search  effort  for  the  global  minimum  of  E(0)  must  be 
devoted  to  locating  the  correct  valley  in  0,  in  the  first  place. 

Consider,  first,  the  case  of  M  *  1,  that  is,  real  parameter 
vector  0  has  just  one  component  0^.  A  possible  sample  of 
condi tionally-minimum  error  E(0^)  is  depicted  in  figure  1. 
Parameter  0^  is  limited  to  observation  interval 
value  0  is  selected  as  the  starting  point  for  a  fine-grained 
search  in  0^,  the  local  minimum  at  0^  will  be  reached.  On  the 
other  hand,  if  value  0g  is  selected  as  the  initial  search  point, 
the  global  minimum  in  interval  will  be  realized  at  0^. 


E(0i) 
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To  guarantee  that  the  correct  valley  of  E(0^)  is  not  missed 
during  the  initial  coarse  search  in  0^ ,  it  is  necessary  to  sample 
in  parameter  value  0^  with  an  increment  of  the  order  of  A^,  as 
indicated  in  figure  1.  That  is,  the  Nyquist  rate  of  variation  of 
E( 0^ )  with  0^  must  be  determined  so  that  a  sufficiently  fine 
increment  A^  can  be  determined.  Too  fine  a  choice  for  A^  will 
result  in  highly-dependent  function  value  samples  for  E(0^)  and  a 
wasteful  excessive  search  time.  Too  coarse  a  choice  for  A^  can 
cause  the  proper  valley  of  E(0)  to  be  missed  entirely.  Perhaps 
the  safest  approach  is  to  compute  a  handful  of  samples  of  13(0^) 
with  some  initial  guess  for  A^  and  to  plot  the  results.  Cases  of 
too  fine  sampling  or  too  coarse  sampling  will  be  obvious  from 
this  plot,  and  a  correction  of  A^  in  the  proper  direction  can 
then  be  made.  The  object  is  to  sample  as  coarsely  in  0^  as 
possible,  without  missing  any  of  the  valleys  of  13(0^). 

For  values  of  M  larger  than  1,  0  is  a  vector  of  M  real 
components,  and  the  search,  initial  as  well  as  final,  must  take 
place  in  M  dimensions.  Perhaps  the  safest  approach  now,  during 
the  initial  coarse  search  phase,  is  to  hold  all  M  components  of 
vector  0  fixed  at  some  nominal  values  except  for  one  component, 
say  0m.  Then,  plot  E(0)  versus  ©m  for  a  handful  of  samples  in 
©m  using  some  initial  guess  for  corresponding  increment  A^  and 
thereby  ascertain  a  proper  value  for  this  particular  increment 
Am.  Repeat  this  procedure  sequentially  for  each  dimension, 
m=l:M,  until  a  complete  set  of  acceptable  increments  {Am}  has 
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been  determined.  Finally,  conduct  the  full-scale  coarse  search 
in  M-dimensional  0  using  these  increments  {AmJ  and  thereby  locate 
the  valley  containing  the  global  minimum.  Whether  this  complete 
search  can  be  accomplished  in  a  reasonable  amount  of  time  depends 
on  the  size  of  M  (the  curse  of  dimensionality)  and  the  extent  of 
search  (initial  uncertainty)  required  on  each  component  0m, 
m=l:M,  of  vector  0. 

The  nominal  values  at  which  M-l  of  the  components  of  0  are 
held  fixed  should  hopefully  be  in  a  fairly  reasonable 
neighborhood  of  the  (unknown)  true  global  minimum  of  E(0). 
Otherwise,  the  selected  values  of  increments  {A  )  could  be 
misleading,  some  being  too  large  and/or  some  being  too  small. 

Each  initial  individual  one-dimensional  plot  of  E(0)  versus  0 

Itl 

will  yield  some  information  as  to  whether  the  corresponding 
increment  Am  is  as  valid  at  one  end  of  the  plot  as  it  is  at  the 
other  end  of  the  plot.  Also,  since  the  dimensions  of  the 
parameter  vector  components  { ©m }  are  generally  different  (for 
example,  seconds,  meters,  Hertz),  the  proper  individual 
increments  {AmJ  could  be  very  different  in  magnitude.  Thus,  this 
initial  coarse  search  is  extremely  worthwhile  and  probably 
mandatory  for  efficient  localization  of  the  global  minimum. 
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QUALITY  OF  LEAST-SQUARES  ESTIMATES 


Recall  from  equation  (1)  that  d  is  the  Nxl  data  vector  of 
received  data,  counting  both  space  and  time  samples.  The  Nxl 
fitting  vector  f(0,a)  in  equation  (7)  is  replaced  here  by  f(<f>), 
where  the  new  parameter  vector  <f>  =  [0;a]  incorporates  all  the  old 
parameter  variables  and  is  of  size  Pxl .  In  this  section,  it  is 
presumed  that  the  data  vector  d,  basis  function  matrix  B(0), 
amplitude  vector  a,  and  parameter  vector  0  are  all  real. 

The  instantaneous  error  vector  is  now  e(<|>)  =  f(<|>)  -  d  and  is 
Nxl.  The  total  squared  error  is 

N 

EU)  =  e(4>)T  e(+)  =  C  e*U)  =  [dT  -  f(<f>)T][d  -  f  ( ♦)  ]  .  (48) 

n=l  n 

Let  be  a  local  minimum  of  E(<f>)  and  define  Pxl  difference  vector 
A  b  <(>  -  $.  By  holding  $  fixed,  the  total  error  can  be  expanded 
according  to 

EU)  a  E($)  +  Ge(£)t  A  +  |  at  HE(j.)  A  for  ♦  near  ♦  .  (49) 

The  Pxl  gradient  vector  GE(  is  zero  at  <f>  =  *.  HE ( )  is  the  PxP 
Hessian  matrix  of  scalar  error  E(<|>);  the  value  of  H_  required  in 
equation  (49)  can  be  calculated  once  $  has  been  determined. 

Also,  the  n-th  component  of  fitting  vector  f($)  is  given  by 
f n ( )  ■  fn($)  +  Gn(^)T  A  +  j  AT  Hn($)  A  for  $  near  $  (50) 
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and  for  n=l:N,  where 


rafnU) 

Gn<*>  - 


3fn(+) 


(51) 


is  the  Pxl  gradient  vector  of  scalar  function  f n ( ) ,  and 


p  f(<m 

V*>  -  [a»-"ayj  '  P'9-ljP 


(52) 


is  the  PxP  Hessian  matrix  of  scalar  function  fn(<|>).  From 
equation  (50)  follows  the  expansion: 


f  (♦) 


‘fj'Ur 


fi(i)l  pid)  a; 


%l±)  A 


'AA  H1(+)  A' 


A  HN( +)  A 


(53) 


This  expansion  will  now  be  employed  in  equation  (48),  namely, 


EU)  =  dT  d  -  2  f  ( <#* ) T  d  +  f(4>)T  f  ( 4> ) 


(54) 


The  first  term  of  interest  is 


£<+>t  d  -  C  d„  £„<*>  -  C  a„  £„<±>  -  C  d„  V±>T  4 

n=i  n=i  n=i 


+  I  C  ■»„  ^  Hn(i)  4  -  «0  ♦  «l  A  +  i  0T  a2  A  ,  (55) 

n=l 


where  random  variables 


C  <*„  £„(♦)  ,  «! 

n=l 


E  dn  Gnd)  £  «2 

n=I 


E]  d  H  (jH  .  (56) 
n=l 
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The  second  term  of  interest  is 


fU)T  fU)  =  rz  f^(*)  =  F2  (fn<±>  +  gu)t  a  +  \  at  hu)  a] 

n=l  n  n=l  ^  n  n  z  n  J 


(50  +  2  pj  A  +  AT  A  , 


(57) 


where  the  quantities 


N 


N 


e0  -  c  f„<±>  -  »1  -  EZ  £„(±)  S  u> 

u  n=l  n  A  n=l  n  n 


N 


(58) 


e2  -  ZZ  («n(±)  «„(♦)  +  Gn(i)  0n(*)T) 

n=l 


Substitution  in  equation  (54)  yields 


EU)  S  [dT  d  -  2  aQ  +  pQ]  +2  -  ai]T  A  +  AT  [p2  -  a2)  A.  (59) 


Since  =  <f>  is  the  location  of  a  minimum  of  E(<|>),  then 


N 


**1  "  «!  -  EH  t  fn<  -  dnl  Gn(i)  =  0  (Pxl  vector) 
n=l 


(60) 


Also, 


e2  "  “2  =  ^Zj  (iV**  "  dn]  Hn(  — }  +  Gn(^)  Gn(-*)T) 
n=l 


N 


=  n  g  (♦)  g  c ♦>  , 

n=l  n  n 


as  will  be  argued  shortly.  It  follows  that 


(61) 
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N 

EU)  =  [dT  d  -  2  «0  +  P0]  +  AT  C  Gn(<fr)  Gn(«j»)T  A 

n— 1 


=  [d  -  f(i)]T  [d  -  f(*)J  +  AT  YU  G  (£)  Gn(-^)T  A  *  (62) 

n=l 

Comparison  of  equation  (62)  with  equation  (49)  yields 


|  HeU)  £  ED  Gn(j.)  Gn(+)T  *  Q  =  q' 
n=l 


(63) 


The  rank  of  PxP  matrix  Q  is  P  because  the  number  N  of  data  points 
d  is  generally  much  larger  than  the  total  number  P  of  unknown 
parameters  4>.  This  result  affords  a  method  of  computing  the 
Hessian  matrix  HE(4>)  of  total  error  E($)  at  its  minimum  location 
<(>=♦.  It  involves  only  the  gradient  vectors  {G  (♦)},  n=l:N,  of 
the  component  basis  functions  { f n ( ♦ ) >  (see  equation  (51)).  This 
result  is  similar  to  that  in  reference  1  (pages  682  -  683). 


To  apply  these  results  to  a  practical  case,  a  particular  form 
is  presumed  for  the  received  data  d,  namely, 

d  ■  f  ( 4>  )  +  w  for  n*l:N  ,  (64) 

n  n  o  n 

where  <J>0  is  the  true  value  of  the  parameter  vector  and  {wnJ  is 
additive  noise.  Then,  equation  (60)  yields 

N 

TZ  lfn(4>o)  '  fn(  — }  +  wn]  Gn(^}  =  0  (Pxl)  *  (65) 

n=l 

However,  for  high  signal-to-noise  ratio  in  equation  (64),  is 
near  4»Q,  allowing  for  approximation 
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(66) 


W  £  fn<±>  +  Gn'i)T  Uo  -  4> 
and,  therefore,  from  equation  (65), 

N  rp 

EZ  g  u)  (Gnu)  “  ♦)  +  wn)  “  o  (pxl)  • 

n=l 

That  is. 


N 

Q  (4>  -  !)  +  LH  wn  Gn(i)  S  0 

n=l 


or 


N 


i  -  ♦„  +  o  1  rz  »„  g  (±)  . 

°  n=l  n  n 


where  matrix  Q  is  given  by  equation  (63). 


(67) 


(68) 


(69) 


The  location  ^  of  the  minimum  of  error  E(<J>)  in  equation  (48) 
is  a  random  variable;  therefore,  matrix  Q  defined  in  equation 
(63)  is  also  random.  However,  for  a  large  number  N  of  data 
points  and  high  input  signal-to-noise  ratio,  minimum  location 
will  not  fluctuate  much.  Then,  a  reasonable  approximation  in 
equation  (69)  is  that  the  dominant  perturbation  is  caused  by  the 
additive  noise  term  (wn) ,  while  the  quantities  Q  and  {Gn(4>)}  are 
relatively  constant.  Under  this  assumption,  the  mean-square 
error  matrix  of  ♦  is,  from  equation  (69), 

MSEU)  -  E{(j>  -  4>0)U  -  *0)T)  -  Q_1  EZ  Gn(±)  cnm  Gm(+)T  Q-1 ,  ( 70 ) 

n ,  m=l 
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where  c 


=  E{wn  wmJ ,  n,m=l:N,  are  the  covariance  matrix  elements 

of  the  additive  noise.  Since  the  additive  noise  covariance 

matrix  [c _ ]  can  be  measured  apriori,  and  gradients  {G  (<j>)}, 

nm  n  — 

n=l:N,  as  well  as  matrix  Q  can  be  calculated  once  the  solution 
point  for  minimum  E ( 4> )  is  found,  the  mean-square  matrix  MSE(j>) 
can  be  evaluated  at  solution  by  means  of  equation  (70). 


For  the  special  case  of  white  noise  {wnJ,  cnm 
it  follows  that 


and 


MSE  ( <f> ) 


i  N 

q-1  n 

n=l 


%  Gn(  — } 


T  -1 

Gn(i)A  Q 


2  -1  ~  -> 
aw  Q  =2 


2  -1 

a  H„(t)  1 

w  E  — 


(71) 


upon  use  of  equation  (63).  (This  result  agrees  with  reference  1, 

equations  (15.5.15)  and  (15.5.11);  however,  equation  (70)  is  a 

more  general  result  for  any  additive  noise  covariance  matrix.) 

In  particular,  the  mean-square  error  of  the  p-th  parameter 

2 

estimate  4>  is  2  a  times  the  p-th  diagonal  of  the  inverse  of  the 
— p  w 

Hessian  matrix  of  total  error  E(<{>)  at  the  solution  point 


From  equation  (63),  it  follows  that 

Q  -  C  Gn<i>  Gn(i)T  =  I  V4>  • 

n=l 


(72) 


The  first  form  requires  calculation  of  N  Pxl  gradient  vectors 
(Gn(  +  )}  of  signal  components  { f  n  ( 4> ) }  in  equation  (51),  whereas 
the  last  form  requires  calculation  of  the  PxP  Hessian  matrix  of 
scalar  error  E(<|>)  in  equation  (48).  The  end  result  in  equation 
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(72)  is  a  useful  approximation  only  for  white  noise  (wn)  and  high 
signal-to-noise  ratio  in  form  (64). 


In  equation  (61),  a  term  was  dropped,  namely, 

rz  -  v  h„u>  . 

n=l 

For  received  data  model  (64),  this  term  becomes 


C  [fnH>  -  VV  -  v  Vi)  . 

n=l 


(73) 


(74) 


For  high  signal-to-noise  ratio,  ^  will  be  close  to  <frQ,  and  this 
term  is  essentially  a  sum  of  random  noise  x  signal  terms.  (This 
is  true  for  the  general  case  in  equation  (73)  as  well.)  The 
remaining  term  in  equation  (61),  namely, 


N 


n=l 


Gn(i) 


Vi> 


(75) 


is  a  sum  of  signal  x  signal  terms.  Therefore,  for  high  signal- 
to-noise  ratio,  it  is  expected  that  this  latter  term  will 
dominate  and  that  the  approximation  in  equation  (61)  is  valid 
(see  also  reference  1,  page  683,  especially  equation  (15.5.11), 
regarding  this  topic). 
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CHANNEL  AND  WAVEFORM  CHARACTERIZATION 


TIME-VARYING  IMPULSE  RESPONSE 

Consider  a  transmitted  signal  consisting  of  a  unit-impulse 
transmitted  at  time  t^,  namely  8(t  -  t1),  and  let  the  received 
signal  at  the  channel  output  be  A  S(t  -  t 2 )>  where  time 

t2  -  f(t1)  >  tx.  ‘  (76) 

The  inverse  function  to  f  is  2,  where 

tx  =  f(t2)  <  t2  .  (77) 

Then  the  time-varying  impulse  response  of  this  channel  at  time  t, 
due  to  a  unit-impulse  excitation  at  time  t^,  is 

h(t;t1)  =  A  8 ( t  -  f(tx) )  .  (78) 

More  generally,  for  an  arbitrary  waveform  u(t)  transmitted 
through  this  channel,  the  received  waveform  is 

r  ( t )  =  J  dt1  u(t1)  hft;^)  =  A  J  dt^  u(t1)  8(t  -  f(t1)) 

=  A  J  dt2  ?'(t2)  u(f(t2))  S(t  -  t2)  =  A  2'( t )  u( ?( t ) )  ,  (79) 

upon  using  the  change  of  variable  t^  =  f(t2).  Usually,  to  a  very 
good  approximation,  f'(t)  is  constant  within  the  observation 
interval,  thereby  yielding  received  waveform 

r ( t )  S  A  u(?(t) )  ,  (80) 

where  scaling  A  absorbs  this  constant  factor.  From  equation 
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(77),  the  inverse  function  satisfies  the  rule  ?(t)  <  t  for  all 
t.  Thus,  to  find  the  received  waveform,  it  is  necessary  to 
determine  the  time  delay  function  f(  )  in  equation  (76)  and  then 
find  its  inverse  f(  )  according  to  equation  (77). 

MOVING  SOURCE  AND  MOVING  RECEIVING  ELEMENTS 

Let  the  source  be  at  location  x(t),  y(t),  z(t)  at  time  t, 
while  receiving  element  m  of  an  array  is  at  xm(t),  Ym(t),  zm^t^’ 
Then,  for  a  direct  path,  straight-line  transmission  of  an  impulse 
at  time  t^  and  speed  of  propagation  c,  it  follows  that 

c2  (t2  -  tj)2  =  [x(tj)  -  xm(t2n2  +  (y(tj)  -  ym(t2)l2 

+  [z(t1)  -  zm(t2)]2  .  (81) 

The  left  side  of  this  equation  is  the  square  of  the  distance 
traveled  by  an  emitted  impulse  between  transmission  at  time  t^ 
and  reception  at  time  t2.  The  right  side  is  the  square  of  the 
distance  between  the  source  location  at  emission  time  t^  and  the 
m-th  receiving  element  location  at  reception  time  t2 . 

To  solve  equation  (81)  for  t^  in  terms  of  t2,  it  is  necessary 
to  specify  the  forms  of  source  location  functions  x(t),  y(t), 
z(t),  as  well  as  the  location  functions  of  the  m-th  receiving 
element.  For  a  constant-depth  source  with  constant  velocity,  it 
follows  that 
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x(t)  -  x  +  x  t  ,  y(t)  -  y  +  y  t 


2  ( t )  ■  z 


(82) 


where  all  five  parameters  are  unknown.  For  a  receiving  array 
moving  in  the  direction  of  the  x-axis,  at  constant  depth  and 
velocity,  it  follows  for  the  m-th  element  that 


x(t)  =  x  +  v  t  ,  ym(t)  =  y  ,  z  ( t ) 
m  in  m  m  m 


(83) 


where  all  these  parameters  are  presumed  known  at  the  receiver. 
Substitution  of  equations  (82)  and  (83)  in  equation  (81)  yields 

c2  <t2  -  tj)2  -  (X  +  i  -  xm  -  V  t2>2 


■f  (y  t  y  tj  -  ym)2  +  (z  -  zj2  . 


(84) 


That  is, 


fcl  a  -  2  fcl  em(t2)  +  rm(t2)  =  0  , 


(85) 


where 


2  -2  *2 
a  =  c  -  x  -  y  , 


3m(t)  -  c  t  +  x  (x  -  xffl  -  v  t)  +  y  (y  -  ym)  , 

rm(t)  -  c2  t2  -  qm(t)  , 

9m(t)  -  (x  -  -  v  t)2  +  (y  -  yj2  +  (z  -  zm)2 


(86) 


Since  equation  (85)  is  quadratic  in  t^,  it  has  explicit  solution 

*  K'V  -  “  W)* 

t1«l  “  -  ' 


(87) 


where  the  negative  square  root  must  be  taken  to  ensure  that 
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tlm  -  ^2 '  T^e  dependence  °f  time  t^  on  element  number  m  is  made 
explicit  here.  Therefore,  from  equation  (77),  the  m-th  inverse 
function  is 


?m(t)  = 
m 


Sm(t)  -  (»»<*>  -  « 


for  all  m 


(88) 


At  this  point,  it  is  useful  to  generalize  and  also  include  a 
surface  bounce  path  as  well  as  a  bottom  bounce  path.  Thus,  ?m(t) 
will  be  denoted  by  dm(t)  for  a  direct  path,  by  sm(t)  for  a 
surface  path,  and  by  bm(t)  for  a  bottom  path.  The  function  dm(t) 
uses  argument  z  -  zm  in  the  last  term  of  qm( t ) ,  as  already 
indicated  in  the  bottom  line  of  equation  (86),  whereas  sm(t)  uses 
argument  z  +  zm  instead,  and  bm(t)  uses  argument  z  +  zm  -  2  d, 
where  d  is  the  water  depth.  Also,  define 


xm(t>  *  *  -  -  v  1  ' 


m 


*  -  ym  • 


( z  -  z 


m 


m 


z  +  z 


m 


for  direct  path 
for  surface  path 


Lz  +  z  -2d  for  bottom  path  ) 
m 


(89) 


Then,  by  canceling  c  terms  to  maintain  significance,  the 
discriminant  in  equation  (88)  becomes 

Dm(t)  *  (t>  -  “  Ym(t)  ‘  c2  (xm(t)  *  >  ^  *  c2  (Ym  +  Y  fc) 

-  (y  V*)  -  i  YJ2  +  (°2  -  i2  -  y2)  zm  '  <90) 


while,  from  equations  (86)  and  (89), 
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(91) 


t  + 


x  X  (t) 
m 


+  y  Y 


m 


which  is  independent  of  Zm.  By  combining  these  results,  equation 
(88)  can  be  expressed  in  its  final  form  as 


c2  t  + 


fm(t) 

m 


x  xm(t) 
m 


+  y 


Y  - 
m 


D(t) 

m 


(92) 


with  Zm  and  Dm(t)  given  by  equations  (89)  and  (90),  respectively, 
for  the  case  of  interest.  In  these  results,  there  are  no 
assumptions  about  x,  y,  or  v  being  small  relative  to  sound  speed 
c.  However,  to  order  1/c, 


Em(t)  S  1  *  I  [(V1’  +  i  ‘)  +  K  +  »  fc)  +  Zm]  '  <93> 


if  an  approximation  is  desired. 


As  a  special  case,  a  stationary  source  has  x  =  y  =  0,  giving 
exactly 

a  =  c2  ,  0m(t)  =  c2  t  ,  rm(t)  =  c2  t2  -  qm(t)  , 

V‘>  *  c2  V11  -  *»<*>  "  1  -  c  •  <94 

The  quantity  qm(t)  follows  from  equation  (86)  as 

V11  *  <*  -  Jtm  -  v  t)2  +  <y  -  ym>2  +  ■  (« 

where  Zm  is  given  generally  by  equation  (89). 
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TRANSMITTED  AND  RECEIVED  WAVEFORMS 


The  transmitted  waveform  is  taken  as 

u(t)  =  e(t  -  T)  g  cos(w  t  +  p) 

=  e(t  -  T)  [I  cos(o)  t)  +  Q  sin(w  t)]  ,  (96) 

where  I,  Q,  T,  w  are  unknown  to  the  receiver.  It  is  presumed 
that  envelope  e(t)  of  u(t)  is  known,  at  least  approximately.  The 
envelope  duration  is  L  seconds.  Also,  without  loss  of  generality, 
envelope  e(t)  starts  at  zero  at  time  t  =  0,  and  max{e(t)}  =  1. 

For  a  direct,  surface,  and  bottom  path,  the  model  of  the 
received  signal  waveform  at  the  m-th  receiving  element,  m=l:M,  is 
taken  as 

rm(t)  '  eK(t)  -  T)  [:d  C05(“  dm(t))  +  Qd  sin(“  dm<t>)] 

+  e(sJl(t)  -  t)  [i.  cos(»  «n(t))  +  Qs  sin  (w  s.tt))] 

+  e^t>m(t)  -  Tj  j^Ib  cos  +  Qfa  sin  t>m(t)j]  (97) 

where  1^,  Qd,  lg,  Qg ,  Ib,  Qb,  T,  to  are  unknown,  in  addition  to 

the  source  parameters  x,  y,  z,  x,  y  that  appear  in  {dm(t)}, 
{sm(t)},  and  (bm(t)}.  These  latter  three  sets  of  functions  are 
available  from  equations  (92)  and  (90),  when  combined  with  the 
corresponding  { Zm }  parts  of  equation  (89).  The  receiving  array 
parameters  {xm) ,  { ym } ,  { zm)  for  m=l:M  and  receiver  speed  v  are 
assumed  known,  as  is  sound  speed  c  (see  equation  (83)). 
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Let  t  =  0  correspond  to  the  beginning  of  the  first  signal 
arrival  (direct  path)  at  element  number  1  of  the  receiving  array. 
Then,  e(0)  =  0  yields  d^(0)  -  T  =  0,  or  T  =  d^(0)  <  0,  giving 

rm(t)  "  e(dm(t)  “  dl<0))  [*d  cos(“  dm(t))  +  °d  sin("  dm(t0] 

+  e(sm(t)  -  d1(0>)  [ls  co»(»  sm(t)j  +  0S  sin  (w  sm  ( t ) )  ] 

+  e^bm(t)  -  d1(0)j  cos  bm(t)j  +  Qfa  sin^w  bm(t))j  (98) 

for  m=l :M.  Unknowns  Id,  Qd,  Ig,  Qs,  lfa,  Qfa  appear  linearly, 

while  unknowns  x,  y,  z,  x,  y,  w  appear  nonlinearly  in  modeled 
received  signals  (rm(t)}  through  the  (dm(t)},  { s  ( t ) } ,  and 
{ b  ( t ) }  functions. 

The  modeled  received  data  are  sampled  at  times  t  <=  nA  for 
n*=l:N,  giving  model  data 

rm(nA)  *  e(dm(nA)  -  d1  (  A )  J  £ld  cos  dm(nA)j  +  Qd  sin(a>  dm(nA)jj 

+  e(sm^nA)  ~  d^(A)j  cos^w  sm(nA)j  +  Qg  sin^w  sm(nA)jJ 

+  e^bm(nA)  -  d1(A)j  j^lfa  cos  (w  bm(nA)J  +  Qb  sin(w  bm(nA)jj 

(99) 

for  m=l:M  and  n=l:N.  This  model  is  to  be  fit  to  the  actual 
measured  data  (pm(nA)}  at  element  number  m  and  time  nA,  by 
choosing  the  following  quantities: 
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(100) 


Id,  Qd,  Is,  Qs,  Ib,  Qb,  X,  y,  z,  x,  y,  <A>  . 

The  first  six  variables  are  to  be  eliminated  analytically,  as 
indicated  in  an  earlier  section.  This  elimination  reduces  the 
number  of  search  dimensions  from  12  to  6,  a  very  worthwhile 
procedure.  A  numerical  search  on  the  remaining  six  variables  in 
equation  (100)  is  then  required. 


When  estimates  of  x,  y,  z,  x,  y  become  available  after  data 
processing,  they  constitute  position  and  velocity  estimates  of 
the  source  at  the  time  t  <=  0  (see  equation  (82))  when  the 
direct-path  signal  first  arrived  at  element  number  1.  This 
analysis  presumes  that  the  source  kept  a  steady  course  from  the 
initiation  of  its  transmission  at  time  T  (see  equation  (96)) 
until  time  t  =  0.  The  quantity  T  =  d^(0)  is  negative  and  is 
available  from  equations  (89)  through  (92)  in  the  form 


d1(0) 


x  (x  -  xA)  +  y  (y  -  yx)  -  D^O) 

2  *~2  *~2 
c  -  x  -  y 


(101) 


with 


D1(0)  =  c2  (x  -  x^ ) 2  +  c2  (y  -  y^)2  (102) 

-  [y  (x  -  x1)  -  x  (y  -  y1 )  ] 2  +  (c2  -  x2  -  y2 )  (z  -  z^2  . 


The  array  positions  {xmJ,  { ym) r  {zml  are  fche  locations  of  the 
receiving  elements  at  time  t  =  0  (see  equation  (83)). 
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The  time  estimate  T  is  available  once  x,  y,  z,  x,  y  are 
available.  The  estimated  source  position  at  time  T  (<0)  is 

P ( T )  s  [ x  +  x  T  ,  y  +  y  T  ,  z]  .  (103) 

This  source  position  estimate  at  time  T,  namely  P(T),  is  the  only 
reliable  description  of  the  actual  source  path  because  the  actual 
source  path  could  have  deviated  from  the  straight-line  assumption 
utilized  in  equation  (82),  both  before  and  after  the  emission 
time  T.  Since  there  was  no  emission  from  the  source  before  time 
T  or  after  time  T+L,  there  is  no  information  about  the  actual 
source  path  at  times  other  than  those  in  the  time  interval 
[ T, T+L ] .  Source  position  estimates  at  other  times  are  only 
projections  based  on  the  constant  heading  assumption. 
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SUMMARY 


A  method  for  reducing  the  dimensionality  of  the  search  for  a 
minimum  in  several  parameters  has  been  derived  for  the  case  where 
some  of  the  parameters  appear  linearly  in  the  model  of  the 
observed  data.  The  remaining  nonlinearly-appearing  parameters  in 
the  model  must  still  be  searched  in  multidimensional  space.  The 
advantages  in  execution  time  achieved  by  this  approach  can  easily 
be  several  orders  of  magnitude. 

Elimination  of  the  linearly-appearing  parameters 
significantly  complicates  the  condi tionally-optimum  total  squared 
error,  resulting  in  a  Hermitian  form  that  must  be  extremized. 

For  efficient  searching  in  the  resultant  space,  the  gradient 
vector  and  the  Hessian  matrix  of  this  Hermitian  form  must  be 
calculated.  Expressions  for  both  of  these  quantities  have  been 
derived. 

The  Hessian  matrix  was  found  to  contain  a  destabilizing  term, 
just  as  there  is  for  the  standard  least-squares  approach  on  all 
the  parameters.  The  form  of  this  term,  however,  is  different 
from  that  encountered  in  the  simpler  standard  approach. 
Nevertheless,  no  second-order  derivatives  of  the  basis  functions 
need  be  computed,  thereby  significantly  reducing  the  amount  of 
computer  effort  required  to  evaluate  the  Hessian  matrix. 
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